OPTIMAL SCALING FOR THE TRANSIENT PHASE OF
THE RANDOM WALK METROPOLIS ALGORITHM: THE
MEAN-FIELD LIMIT

We consider the random walk Metropolis algorithm on $\mathbb{R}^n$ with
Gaussian proposals, when the target probability is the $n$-fold
product of a one-dimensional law. In the limit $n \to \infty$, it is well
known (see [22]) that, when the variance of the proposal scales
inversely proportionally to the dimension $n$ whereas time is accelerated
by the factor $n$, a diffusive limit is obtained for each component of the
Markov chain if this chain starts at equilibrium. This paper extends
this result to the case where the initial distribution is not the target
probability measure. Remarking that the interaction between the components
of the chain due to the common acceptance/rejection of the proposed
moves is of mean-field type, we obtain a propagation of chaos result
under the same scaling as in the stationary case. This proves that,
in terms of the dimension $n$, the same scaling holds for the transient
phase of the Metropolis-Hastings algorithm as near stationarity.
The diffusive and mean-field limit of each component is a nonlinear
diffusion process in the sense of McKean. This opens the route to
new investigations of the optimal choice for the variance of the
proposal distribution in order to accelerate convergence to equilibrium
(see [12]).



1. Introduction. Many Markov Chain Monte Carlo (MCMC) methods
are based on the Metropolis-Hastings algorithm [15, 11]. Let us recall
this well-known sampling technique. Let us consider a target probability
distribution on $\mathbb{R}^n$ with density $p$. Starting from an initial
random variable $X_0$, the Metropolis-Hastings algorithm generates
iteratively a Markov chain $(X_k)_{k \ge 0}$ in two steps. At time $k$,
given $X_k$, a candidate $Y_{k+1}$ is sampled using a proposal distribution
with density $q(X_k, y)$. Then, the proposal $Y_{k+1}$ is accepted with
probability $\alpha(X_k, Y_{k+1})$, where

$$\alpha(x, y) = 1 \wedge \frac{p(y)\,q(y,x)}{p(x)\,q(x,y)}.$$

Here and in the following, we use the standard notation $a \wedge b = \min(a, b)$.
If the proposed value is accepted, then $X_{k+1} = Y_{k+1}$; otherwise $X_{k+1} = X_k$.
The Markov chain $(X_k)_{k \ge 0}$ is by construction reversible with respect to
the target density $p$, and thus admits $p(x)\,dx$ as an invariant distribution.
The efficiency of this algorithm highly depends on the choice of the proposal
distribution $q$. One common choice is a Gaussian proposal centered at point
$x \in \mathbb{R}^n$ with variance $\sigma^2\,\mathrm{Id}_{n \times n}$:

$$q(x,y) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{|x-y|^2}{2\sigma^2}\right).$$

Since the proposal is symmetric ($q(x,y) = q(y,x)$), the acceptance
probability reduces to

(1.1) $$\alpha(x,y) = 1 \wedge \frac{p(y)}{p(x)}.$$

Metropolis-Hastings algorithms with symmetric kernels are called random
walk Metropolis (RWM) algorithms.
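
As an illustration of the two-step mechanism recalled above, here is a minimal Python sketch of the RWM algorithm with Gaussian proposals. The function name `rwm_chain`, its arguments, and the choice of passing `log_p` (the log-density up to an additive constant) are assumptions made for this example, not notation from the paper.

```python
import math
import random


def rwm_chain(log_p, x0, sigma, n_steps, rng):
    """Random walk Metropolis with Gaussian proposals N(x, sigma^2 Id).

    log_p : function returning log p(x) up to an additive constant
    x0    : list of floats, the initial state
    Returns the list of visited states and the fraction of accepted moves.
    """
    x = list(x0)
    lp_x = log_p(x)
    chain, accepted = [list(x)], 0
    for _ in range(n_steps):
        # Proposal step: Gaussian increment around the current state.
        y = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        lp_y = log_p(y)
        # Symmetric proposal: accept with probability min(1, p(y)/p(x)).
        # 1 - rng.random() lies in (0, 1], so the logarithm is well defined.
        if math.log(1.0 - rng.random()) <= lp_y - lp_x:
            x, lp_x = y, lp_y
            accepted += 1
        chain.append(list(x))
    return chain, accepted / n_steps


if __name__ == "__main__":
    # Hypothetical usage: standard Gaussian target in dimension 2.
    target = lambda x: -0.5 * sum(v * v for v in x)
    _, rate = rwm_chain(target, [3.0, -3.0], 1.0, 10000, random.Random(0))
    print("acceptance rate:", rate)
```

Working with log-densities avoids underflow of $p(y)/p(x)$ in high dimension; the comparison $\log U \le \log p(y) - \log p(x)$ is equivalent to $U \le p(y)/p(x)$.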

The choice of the variance is crucial for the performance of the RWM 
algorithm. It should be sufficiently large to ensure a good exploration of the 
state space, but not too large otherwise the rejection rate becomes typically 
very high since the proposed moves fall in low probability regions, in par- 
ticular in high dimension. It is expected that the higher the dimension, the 
smaller the variance of the proposal should be. The first theoretical results to
optimize the choice of $\sigma^2$ in terms of the dimension $n$ are due to G. Roberts,
A. Gelman and W.R. Gilks in [22]. The authors study the RWM algorithm
under two fundamental (and somewhat restrictive) assumptions: (i) the target
probability distribution is the $n$-fold tensor product of a one-dimensional
density:

(1.2) $$p(x) = \prod_{i=1}^n \exp(-V(x_i)),$$

where $x = (x_1, \dots, x_n)$ and $\int_{\mathbb{R}} \exp(-V) = 1$, and (ii) the initial distribution
is the target probability:

$$X_0^n \sim p(x)\,dx.$$
The superscript $n$ in the Markov chain $(X^n_k)_{k \ge 0}$ explicitly indicates the
dependency on the dimension $n$. Then, under additional regularity assumptions
on $V$, the authors prove that for a proper scaling of the variance as a
function of the dimension, namely

$$\sigma_n^2 = \frac{\ell^2}{n},$$

where $\ell$ is a fixed constant, the Markov process $(X^{1,n}_{\lfloor nt \rfloor})_{t \ge 0}$ (where $X^{1,n}_k \in \mathbb{R}$
denotes the first component of $X^n_k \in \mathbb{R}^n$) converges in law to a diffusion
process:

(1.3) $$dX_t = \sqrt{h(\ell)}\,dB_t - h(\ell)\,\frac{1}{2}\,V'(X_t)\,dt,$$

where

(1.4) $$h(\ell) = 2\ell^2\,\Phi\left(-\frac{\ell\sqrt{I}}{2}\right) \quad\text{and}\quad I = \int_{\mathbb{R}} (V')^2 \exp(-V).$$

Here and in the following, $\lfloor\cdot\rfloor$ denotes the integer part (for $y \in \mathbb{R}$, $\lfloor y \rfloor \in \mathbb{Z}$
and $\lfloor y \rfloor \le y < \lfloor y \rfloor + 1$) and $\Phi$ is the cumulative distribution function of
the normal distribution ($\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^x \exp(-y^2/2)\,dy$). The scalings of the
variance and of the time as functions of the dimension indicate
how to make the RWM algorithm efficient in high dimension. Moreover, a
practical counterpart of this result is that $\ell$ should be chosen such that $h(\ell)$
is maximum (the optimal value of $\ell$ is $\ell^* \simeq 2.38/\sqrt{I}$) in order to optimize the
time scaling in (1.3). This optimal value of $\ell$ corresponds equivalently to an
average acceptance rate 0.234 (independently of the value of $I$): for $\ell = \ell^*$,

$$\int\!\!\int_{\mathbb{R}^n \times \mathbb{R}^n} \alpha(x, y)\,p(x)\,q(x, y)\,dx\,dy = 2\Phi\left(-\frac{\ell^*\sqrt{I}}{2}\right) \simeq 0.234.$$

Thus, the practical way to choose $\sigma^2$ is to scale it in such a way that the
average acceptance rate is roughly 1/4.
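
The optimal constant and the associated acceptance rate can be recovered numerically. The sketch below maximizes $h(\ell) = 2\ell^2\Phi(-\ell\sqrt{I}/2)$ by a golden-section search; the helper names (`Phi`, `h`, `argmax_h`), the search bracket and the value $I = 1$ are assumptions chosen for illustration.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def h(ell, I):
    """Speed of the limiting diffusion: h(l) = 2 l^2 Phi(-l sqrt(I) / 2)."""
    return 2.0 * ell ** 2 * Phi(-ell * math.sqrt(I) / 2.0)


def argmax_h(I, lo=1e-3, hi=20.0, tol=1e-10):
    """Golden-section search for the maximizer of h(., I).

    The bracket [lo, hi] is expressed in units of 1/sqrt(I).
    """
    lo, hi = lo / math.sqrt(I), hi / math.sqrt(I)
    r = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    while hi - lo > tol:
        if h(c, I) > h(d, I):
            # The maximum lies in [lo, d]; reuse c as the new right probe.
            hi, d = d, c
            c = hi - r * (hi - lo)
        else:
            # The maximum lies in [c, hi]; reuse d as the new left probe.
            lo, c = c, d
            d = lo + r * (hi - lo)
    return 0.5 * (lo + hi)


I = 1.0  # hypothetical value of the integral of (V')^2 exp(-V)
ell_star = argmax_h(I)
acc_star = 2.0 * Phi(-ell_star * math.sqrt(I) / 2.0)
print("ell* sqrt(I) =", ell_star * math.sqrt(I), " acceptance =", acc_star)
```

Since $h$ depends on $\ell$ only through $\ell\sqrt{I}$ up to the factor $1/I$, changing `I` rescales `ell_star` but leaves `acc_star` unchanged, which is the sense in which the acceptance rate is independent of $I$.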

There exist several extensions of such results for various Metropolis-Hastings
algorithms, see [20, 21, 16, 17, 3, 4, 6], and some of them relax
in particular the first main assumption mentioned above about the product 
form of the target distribution, see [8, 7, 1, 2, 5]. Extensions to infinite 
dimensional settings have also been explored, see [14, 18, 5]. 

All these results assume stationarity: the initial measure is the target 
probability measure. To the best of the authors' knowledge, the only work 
which deals with a non-stationary case is [9] where partial scaling results
are obtained for the RWM algorithm with a Gaussian target. In the recent 
paper [17, p. 3], the authors also mention [9] as the only paper dealing with 
the non-stationary situation. 

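The aim of the present article is to show that, for the RWM algorithm,
using the same scaling for the variance and the time as in the stationary
case (namely $\sigma_n^2 = \frac{\ell^2}{n}$ and considering $(X^{1,n}_{\lfloor nt \rfloor})_{t \ge 0}$), one obtains in the limit
$n$ goes to infinity the nonlinear (in the sense of McKean) diffusion process:

(1.5) $$dX_t = \Gamma^{1/2}\big(E[(V'(X_t))^2], E[V''(X_t)]\big)\,dB_t - \mathcal{G}\big(E[(V'(X_t))^2], E[V''(X_t)]\big)\,V'(X_t)\,dt,$$

where, for $a \in [0, +\infty]$ and $b \in \mathbb{R}$,

(1.6) $$\Gamma(a,b) = \begin{cases}
\ell^2\,\Phi\left(-\frac{\ell b}{2\sqrt{a}}\right) + \ell^2 e^{\frac{\ell^2(a-b)}{2}}\,\Phi\left(\ell\left(\frac{b}{2\sqrt{a}} - \sqrt{a}\right)\right) & \text{if } a \in (0, +\infty),\\
\frac{\ell^2}{2} & \text{if } a = +\infty,\\
\ell^2 e^{-\frac{\ell^2 b^+}{2}} & \text{if } a = 0,
\end{cases}$$

where $b^+ = \max(b, 0)$, and

(1.7) $$\mathcal{G}(a,b) = \begin{cases}
\ell^2 e^{\frac{\ell^2(a-b)}{2}}\,\Phi\left(\ell\left(\frac{b}{2\sqrt{a}} - \sqrt{a}\right)\right) & \text{if } a \in (0, +\infty),\\
0 & \text{if } a = +\infty,\\
1_{\{b > 0\}}\,\ell^2 e^{-\frac{\ell^2 b}{2}} & \text{if } a = 0.
\end{cases}$$

Notice that we will assume $V''$ to be bounded, so that the coefficients in (1.5)
are well defined. This convergence result is precisely stated in Theorem 1
below and can be seen as a mean-field limit. We would like to mention
that another (different in nature) mean-field limit is considered in [8] in
the context of optimal scaling: the limit is obtained, under the stationarity
assumption, for a target measure which admits some mean-field limit as
$n \to \infty$.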
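
To make the coefficients of the limiting diffusion concrete, the following Python sketch evaluates $\Gamma$ and $\mathcal{G}$ under our reading of (1.6) and (1.7), and checks the stationary identity $\Gamma(I,I) = 2\mathcal{G}(I,I) = h(\ell)$. The function names and the explicit argument `ell` are assumptions for this example; for very large $a - b$ the exponential factor may overflow in floating point, which this sketch does not guard against.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def Gamma(a, b, ell):
    """Diffusion coefficient, for a in [0, +inf] and b real."""
    if a == math.inf:
        return ell ** 2 / 2.0
    if a == 0.0:
        return ell ** 2 * math.exp(-ell ** 2 * max(b, 0.0) / 2.0)
    ra = math.sqrt(a)
    return ell ** 2 * (
        Phi(-ell * b / (2.0 * ra))
        + math.exp(ell ** 2 * (a - b) / 2.0) * Phi(ell * (b / (2.0 * ra) - ra))
    )


def G(a, b, ell):
    """Drift coefficient, for a in [0, +inf] and b real."""
    if a == math.inf:
        return 0.0
    if a == 0.0:
        return ell ** 2 * math.exp(-ell ** 2 * b / 2.0) if b > 0 else 0.0
    ra = math.sqrt(a)
    return ell ** 2 * math.exp(ell ** 2 * (a - b) / 2.0) * Phi(ell * (b / (2.0 * ra) - ra))


if __name__ == "__main__":
    ell, I = 2.38, 1.0  # hypothetical values
    h = 2.0 * ell ** 2 * Phi(-ell * math.sqrt(I) / 2.0)
    print(Gamma(I, I, ell), 2.0 * G(I, I, ell), h)
```

At $a = b = I$ the two $\Phi$ terms in $\Gamma$ coincide, which is why the stationary dynamics (1.3) is recovered.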

Our convergence result generalizes the previous analysis in [22] which 
was limited to the stationary case (namely Xq is distributed according 
to p{x)dx). In particular, in the stationary case, we recover the dynam- 
ics (1.3). It also generalizes results from [9] to non-Gaussian targets. 

The proof is based on a classical technique to prove propagation of chaos [23] . 
We first show the tightness of the empirical distribution. Then we pass to 
the limit in a martingale problem, which is the weak formulation of (1.5). 
Notice that such a weak formulation has also recently been used in [14] to 
deal with the stationary case. 
This new result opens the route to new investigations of the optimal choice
for the variance of the proposal distribution, by precisely taking into account 
the transient regime (when the Markov chain is not yet at equilibrium). It 
shows for example how to scale properly the variance and the number of 
samples as a function of the dimension, at least for a product target. A more 
detailed analysis of the longtime behavior of the nonlinear diffusion (1.5) and 
of the practical counterparts of this convergence result will be the subject 
of a forthcoming work [12]. 

The paper is organized as follows. In Section 2 we state our main con- 
vergence result, and we present a formal derivation of the limiting diffusion 
process. Then, Section 3 is devoted to the rigorous proof of the convergence
result. 

2. The main convergence result. Let us first present the precise 
statement for the main convergence result, together with a formal derivation 
of the limiting process. 

2.1. Notation and convergence to the diffusion process. We consider a
Random Walk Metropolis algorithm using Gaussian proposal with variance
$\sigma_n^2 = \frac{\ell^2}{n}$, and with target $p$ defined by (1.2). The Markov chain generated
using this algorithm writes:

(2.1) $$X^{i,n}_{k+1} = X^{i,n}_k + \frac{\ell}{\sqrt{n}}\,G^i_{k+1}\,1_{\mathcal{A}_{k+1}}, \quad 1 \le i \le n,$$

with

$$\mathcal{A}_{k+1} = \left\{ U_{k+1} \le e^{\sum_{i=1}^n \left( V(X^{i,n}_k) - V\left(X^{i,n}_k + \frac{\ell}{\sqrt{n}} G^i_{k+1}\right) \right)} \right\},$$

where $(G^i_k)_{i,k \ge 1}$ is a sequence of independent and identically distributed
(i.i.d.) normal random variables, independent from a sequence $(U_k)_{k \ge 1}$ of
i.i.d. random variables uniform on [0,1]. We assume that the initial positions
$(X_0^{1,n}, \dots, X_0^{n,n})$ are exchangeable (namely the law of the vector is
invariant under permutation of the indices) and independent from $(G^i_k)_{i,k \ge 1}$
and $(U_k)_{k \ge 1}$. Exchangeability is preserved by the dynamics: for all $k \ge 1$,
$(X_k^{1,n}, \dots, X_k^{n,n})$ are exchangeable. We denote by $\mathcal{F}^n_k$ the sigma field generated
by $(X_0^{1,n}, \dots, X_0^{n,n})$ and $(G^1_l, \dots, G^n_l, U_l)_{1 \le l \le k}$.
In all the following, we also assume that

(2.2) $V$ is a $C^3$ function on $\mathbb{R}$
with bounded second and third order derivatives.
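
A single transition of the interacting system (2.1) can be sketched as follows: all components are moved with independent Gaussian increments of variance $\ell^2/n$, and one common uniform variable decides acceptance for the whole vector. The function name `rwm_product_step` and its signature are assumptions for this example.

```python
import math
import random


def rwm_product_step(x, V, ell, rng):
    """One step of the chain for the product target exp(-sum_i V(x_i)).

    x   : list of the n current components
    V   : one-dimensional potential (minus the log-density)
    ell : scaling constant, the proposal variance being ell^2 / n
    Returns the new state and a boolean telling whether the move was accepted.
    """
    n = len(x)
    sigma = ell / math.sqrt(n)
    y = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
    # Log of the acceptance ratio p(y)/p(x) for the product target.
    log_ratio = sum(V(xi) - V(yi) for xi, yi in zip(x, y))
    # Common acceptance event: U <= exp(log_ratio), with U uniform on (0, 1].
    accepted = math.log(1.0 - rng.random()) <= log_ratio
    return (y if accepted else x), accepted


if __name__ == "__main__":
    # Hypothetical experiment: Gaussian potential, all components started at 0.
    rng = random.Random(42)
    V = lambda u: 0.5 * u * u
    n, ell, trials = 100, 1.0, 2000
    hits = sum(rwm_product_step([0.0] * n, V, ell, rng)[1] for _ in range(trials))
    print("empirical first-step acceptance:", hits / trials)
    print("exact value (1 + ell^2/n)^(-n/2):", (1.0 + ell ** 2 / n) ** (-n / 2.0))
```

For this particular start the first-step acceptance probability is $E[\exp(-\frac{\ell^2}{2n}\chi^2_n)] = (1+\ell^2/n)^{-n/2}$, which tends to $e^{-\ell^2/2}$ as $n$ grows; this gives a simple sanity check of the coupling between components through the common accept/reject decision.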
For $t \ge 0$ and $i \in \{1, \dots, n\}$, let

$$Y^{i,n}_t = (\lceil nt \rceil - nt)\,X^{i,n}_{\lfloor nt \rfloor} + (nt - \lfloor nt \rfloor)\,X^{i,n}_{\lceil nt \rceil}$$

be the linear interpolation of the Markov chain obtained by rescaling time
(the characteristic time scale is $1/n$ and $Y^{i,n}_{k/n} = X^{i,n}_k$ for every integer $k \ge 0$). Here and
in the following $\lceil\cdot\rceil$ is the upper integer part (for $y \in \mathbb{R}$, $\lceil y \rceil \in \mathbb{Z}$ and
$\lceil y \rceil - 1 < y \le \lceil y \rceil$).
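
The time rescaling and interpolation can be written in a few lines. In the sketch below, `interpolate` is a hypothetical helper; note that at grid times $t = k/n$ both weights of the displayed formula vanish when taken literally, so the code returns $X_k$ directly there, in agreement with the stated identity $Y_{k/n} = X_k$.

```python
import math


def interpolate(X, n, t):
    """Linear interpolation of one component of the chain at rescaled time t.

    X is the list (X_0, X_1, ...) of successive values of that component,
    and one unit of rescaled time corresponds to n steps of the chain.
    """
    s = n * t
    lo, hi = math.floor(s), math.ceil(s)
    if lo == hi:
        # t = k/n: the interpolated path passes through X_k.
        return X[lo]
    return (hi - s) * X[lo] + (s - lo) * X[hi]


if __name__ == "__main__":
    X = [0.0, 2.0, 1.0]  # hypothetical values X_0, X_1, X_2
    print([interpolate(X, 2, t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)])
```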

Let us define the notion of convergence (namely the propagation of chaos)
that will be useful to study the convergence of the interacting particle system
$((Y^{1,n}_t, \dots, Y^{n,n}_t)_{t \ge 0})_{n \ge 1}$ in the limit $n$ goes to infinity.

Definition 1. Let $E$ be a separable metric space. A sequence $(\chi^n_1, \dots, \chi^n_n)_{n \ge 1}$
of exchangeable $E^n$-valued random variables is said to be $\nu$-chaotic, where $\nu$
is a probability measure on $E$, if for fixed $k \in \mathbb{N}^*$, the law of $(\chi^n_1, \dots, \chi^n_k)$
converges in distribution to $\nu^{\otimes k}$ as $n$ goes to $\infty$.

We are now in position to state the main convergence result. 

Theorem 1. Let $m$ be a probability measure on $\mathbb{R}$ such that $\int_{\mathbb{R}} (V')^2(x)\,m(dx) < +\infty$.
Let us also assume (2.2). If the initial positions $((X_0^{1,n}, \dots, X_0^{n,n}))_{n \ge 1}$
are $m$-chaotic and such that $\lim_{n \to \infty} E[(V'(X_0^{1,n}))^2] = \int_{\mathbb{R}} (V')^2(x)\,m(dx)$,
then the processes $((Y^{1,n}_t, \dots, Y^{n,n}_t)_{t \ge 0})_{n \ge 1}$ are $P$-chaotic where $P$ denotes
the law (on the space $C(\mathbb{R}_+, \mathbb{R})$ of continuous functions with values in $\mathbb{R}$) of
the solution to the nonlinear stochastic differential equation in the sense of
McKean (for which strong and weak existence and uniqueness hold)

(2.3) $$X_t = \xi + \int_0^t \Gamma^{1/2}\big(E[(V'(X_s))^2], E[V''(X_s)]\big)\,dB_s - \int_0^t \mathcal{G}\big(E[(V'(X_s))^2], E[V''(X_s)]\big)\,V'(X_s)\,ds,$$

where $\Gamma$ and $\mathcal{G}$ are respectively defined by (1.6) and (1.7) and $(B_t)_{t \ge 0}$ is a
Brownian motion independent from the initial position $\xi$ distributed according
to $m$.
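
One way to get a feel for the nonlinear equation (2.3) is to simulate it by an interacting particle approximation: the expectations $E[(V'(X_s))^2]$ and $E[V''(X_s)]$ are replaced by averages over $N$ particles and the dynamics is discretized by an Euler-Maruyama scheme. This is only an illustrative sketch under our reading of (1.6) and (1.7); the names `coeffs`, `mckean_particles`, the step size and the particle number are assumptions, and this scheme is not the one analyzed in the paper.

```python
import math
import random


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def coeffs(a, b, ell):
    """Return the pair (Gamma(a, b), G(a, b)) for finite a >= 0."""
    if a == 0.0:
        gam = ell ** 2 * math.exp(-ell ** 2 * max(b, 0.0) / 2.0)
        g = ell ** 2 * math.exp(-ell ** 2 * b / 2.0) if b > 0 else 0.0
        return gam, g
    ra = math.sqrt(a)
    g = ell ** 2 * math.exp(ell ** 2 * (a - b) / 2.0) * Phi(ell * (b / (2.0 * ra) - ra))
    return ell ** 2 * Phi(-ell * b / (2.0 * ra)) + g, g


def mckean_particles(x0, dV, d2V, ell, T, dt, rng):
    """Euler-Maruyama particle approximation of the nonlinear diffusion.

    x0  : list of initial particle positions (samples of the initial law)
    dV  : derivative V' of the potential; d2V : second derivative V''
    """
    x = list(x0)
    for _ in range(int(round(T / dt))):
        a = sum(dV(xi) ** 2 for xi in x) / len(x)   # empirical E[(V'(X))^2]
        b = sum(d2V(xi) for xi in x) / len(x)       # empirical E[V''(X)]
        gam, g = coeffs(a, b, ell)
        s = math.sqrt(gam * dt)
        x = [xi + s * rng.gauss(0.0, 1.0) - g * dV(xi) * dt for xi in x]
    return x


if __name__ == "__main__":
    # Hypothetical test case: V(x) = x^2/2, all particles started at 0.
    rng = random.Random(7)
    x = mckean_particles([0.0] * 1000, lambda u: u, lambda u: 1.0, 1.0, 15.0, 0.05, rng)
    print("empirical second moment:", sum(v * v for v in x) / len(x))
```

For the Gaussian potential the equilibrium second moment is 1, since $\Gamma(1,1) = 2\mathcal{G}(1,1)$; the empirical second moment of the particles should therefore approach 1 up to sampling and discretization error.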

Let us make a few remarks on this result. First, concerning the assumption
on the initial positions $((X_0^{1,n}, \dots, X_0^{n,n}))_{n \ge 1}$, we note that it is satisfied for
instance when the random variables $X_0^{1,n}, \dots, X_0^{n,n}$ are i.i.d. according to some
probability measure $m$ on $\mathbb{R}$. Second, notice that Theorem 1
does not require $\exp(-V)$ to be integrable. Finally, according to [10] (see
Proposition 10.4 p. 149 and Theorem 10.2 p. 148), under the assumptions
of Theorem 1, the piecewise constant processes $((X^{1,n}_{\lfloor nt \rfloor}, \dots, X^{n,n}_{\lfloor nt \rfloor})_{t \ge 0})_{n \ge 1}$
are also $P$-chaotic when the space of càdlàg sample paths from $[0, +\infty)$ is
endowed with the topology of uniform convergence on compact sets.

In the following, we will also need the infinitesimal generator associated
to (2.3). For a probability measure $\mu$ on $\mathbb{R}$, $\langle \mu, V'' \rangle$ is well defined by
boundedness of $V''$, and $\langle \mu, (V')^2 \rangle$ is also well defined in $[0, +\infty]$. Here and in the
following, the bracket notation refers to the duality bracket for probability
measures on $\mathbb{R}$: for $\mu$ a probability measure and $\phi$ a bounded measurable
function,

$$\langle \mu, \phi \rangle = \int_{\mathbb{R}} \phi(x)\,\mu(dx).$$

The differential operator associated to (2.3) is defined by:

(2.4) $$L_\mu \varphi(x) = \frac{1}{2}\,\Gamma\big(\langle \mu, (V')^2 \rangle, \langle \mu, V'' \rangle\big)\,\varphi''(x) - \mathcal{G}\big(\langle \mu, (V')^2 \rangle, \langle \mu, V'' \rangle\big)\,V'(x)\,\varphi'(x).$$

More precisely, if $X_t$ satisfies (2.3), then for any test function $\varphi$,
$\varphi(X_t) - \int_0^t L_{P_s}\varphi(X_s)\,ds$ is a martingale, where $P_t$ denotes the law of $X_t$: for any
$s \le t$,

(2.5) $$E\left( \varphi(X_t) - \int_s^t L_{P_r}\varphi(X_r)\,dr \,\Big|\, \mathcal{F}_s \right) = \varphi(X_s),$$

where $\mathcal{F}_s = \sigma(X_r, r \le s)$. Actually, as explained in Section 3.1 below, this
martingale representation characterizes the distribution (over $C(\mathbb{R}_+, \mathbb{R})$) of
solutions to (2.3): solutions to (2.5) are distributions of solutions to (2.3),
and reciprocally.

In addition to the previous convergence result, we are able to identify the 
limiting average acceptance rate. 



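Proposition 1. Under the assumptions of Theorem 1, the function

$$t \mapsto E\left| P\big(\mathcal{A}_{\lfloor nt \rfloor + 1} \,\big|\, \mathcal{F}^n_{\lfloor nt \rfloor}\big) - \frac{1}{\ell^2}\,\Gamma\big(E[(V'(X_t))^2], E[V''(X_t)]\big) \right|$$

converges locally uniformly to 0 and in particular, the average acceptance
rate $t \mapsto P(\mathcal{A}_{\lfloor nt \rfloor + 1})$ converges locally uniformly to $t \mapsto \mathrm{acc}(E[(V'(X_t))^2], E[V''(X_t)])$
where for any $a \ge 0$ and $b \in \mathbb{R}$,

(2.6) $$\mathrm{acc}(a,b) = \frac{\Gamma(a,b)}{\ell^2}.$$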
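
The limiting acceptance rate can be cross-checked numerically. Under our reading of (1.6), $\Gamma(a,b)/\ell^2$ coincides with the Gaussian expectation $E[\min(1, e^{\ell\sqrt{a}Z - \ell^2 b/2})]$ with $Z \sim N(0,1)$, which is the heuristic behind the formal derivation of Section 2.3. The sketch below compares the closed form with a midpoint quadrature of that expectation; the function names, the truncation `half_width` and the grid size are assumptions for this example.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def acc(a, b, ell):
    """Closed-form acceptance rate Gamma(a, b) / ell^2 for a in (0, +inf)."""
    ra = math.sqrt(a)
    return (Phi(-ell * b / (2.0 * ra))
            + math.exp(ell ** 2 * (a - b) / 2.0) * Phi(ell * (b / (2.0 * ra) - ra)))


def acc_quadrature(a, b, ell, half_width=10.0, n=20000):
    """Midpoint rule for E[min(1, exp(ell sqrt(a) Z - ell^2 b / 2))], Z ~ N(0,1)."""
    dz = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        z = -half_width + (j + 0.5) * dz
        weight = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += min(1.0, math.exp(ell * math.sqrt(a) * z - ell ** 2 * b / 2.0)) * weight * dz
    return total


if __name__ == "__main__":
    for a, b, ell in [(1.0, 1.0, 2.38), (0.3, 1.0, 1.0), (2.0, -0.5, 1.5)]:
        print(a, b, ell, acc(a, b, ell), acc_quadrature(a, b, ell))
```

With $a = b$ and $\ell = 2.38/\sqrt{a}$ the first case reproduces the stationary value close to 0.234.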
2.2. Relation to previous results in the literature. Let us discuss how
this theorem is related to previous results in the literature. First, when
$Z = \int_{\mathbb{R}} e^{-V(x)}\,dx < +\infty$, our convergence result generalizes the scaling limit
for the random walk Metropolis-Hastings algorithm stated in the early
paper [22] under the restrictive assumption that the vector of initial positions
$(X_0^{1,n}, \dots, X_0^{n,n})$ is distributed according to the target distribution $p(x)\,dx$.
In this case, it is clear that for all $n, k \in \mathbb{N}$, $(X_k^{1,n}, \dots, X_k^{n,n})$ is distributed
according to $p(x)\,dx$. Moreover, we have the following result:

Lemma 1. Assume that (2.2) holds, and that $\int_{\mathbb{R}} e^{-V(x)}\,dx < \infty$. Then

$$\int_{\mathbb{R}} (V'(x))^2 e^{-V(x)}\,dx = \int_{\mathbb{R}} V''(x)\,e^{-V(x)}\,dx < +\infty.$$

Proof. Indeed, since $|V'(x)| \le |V'(0)| + \|V''\|_\infty |x|$, there exist a
sequence $(x_n)_n$ of negative numbers tending to $-\infty$ and a sequence $(y_n)_n$
of positive numbers tending to $+\infty$ such that
$\lim_{n \to +\infty} |V'(x_n)|\,e^{-V(x_n)} = \lim_{n \to +\infty} |V'(y_n)|\,e^{-V(y_n)} = 0$. By integration by parts,

$$\int_{x_n}^{y_n} (V'(x))^2 e^{-V(x)}\,dx = V'(x_n)\,e^{-V(x_n)} - V'(y_n)\,e^{-V(y_n)} + \int_{x_n}^{y_n} V''(x)\,e^{-V(x)}\,dx.$$

Taking the limit $n \to \infty$ thanks to monotone convergence in the left-hand side
and thanks to Lebesgue's theorem and boundedness of $V''$ in the integral in
the right-hand side, one concludes that
$\int_{\mathbb{R}} (V'(x))^2 e^{-V(x)}\,dx = \int_{\mathbb{R}} V''(x)\,e^{-V(x)}\,dx < +\infty$. □

One deduces that for each $t \ge 0$ the solution $X_t$ of (2.3) is distributed
according to $\exp(-V(x))\,dx$ so that $(X_t)_{t \ge 0}$ also solves the stochastic
differential equation (1.3)-(1.4) with time-homogeneous coefficients (here, we
use the fact that $\Gamma(I,I) = 2\mathcal{G}(I,I) = h(\ell)$ where $I = \int_{\mathbb{R}} (V'(x))^2 e^{-V(x)}\,dx =
\int_{\mathbb{R}} V''(x)\,e^{-V(x)}\,dx$). Notice that our convergence result requires more
regularity but less integrability than in [22, Theorem 1.1] where the log-density
$-V$ is assumed to be $C^2$ with a bounded second order derivative and such
that $\int_{\mathbb{R}} (V')^8 \exp(-V) < +\infty$.
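
The identity of Lemma 1 is easy to check numerically on an example. The sketch below uses the hypothetical potential $V(x) = x^2/2 + \cos x$, chosen only because $V'' = 1 - \cos x$ and $V''' = \sin x$ are bounded; the trapezoid rule and the truncation to $[-12, 12]$ are likewise assumptions of this illustration.

```python
import math


def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule with n subintervals on [lo, hi]."""
    step = (hi - lo) / n
    inner = sum(f(lo + j * step) for j in range(1, n))
    return step * (0.5 * f(lo) + inner + 0.5 * f(hi))


# Hypothetical potential with bounded second and third derivatives.
V = lambda x: 0.5 * x * x + math.cos(x)
dV = lambda x: x - math.sin(x)
d2V = lambda x: 1.0 - math.cos(x)

# Left- and right-hand sides of the identity of Lemma 1 (unnormalized).
lhs = trapezoid(lambda x: dV(x) ** 2 * math.exp(-V(x)), -12.0, 12.0, 24000)
rhs = trapezoid(lambda x: d2V(x) * math.exp(-V(x)), -12.0, 12.0, 24000)
print(lhs, rhs)
```

The boundary terms $V'(x)e^{-V(x)}$ at $\pm 12$ are of order $e^{-72}$, so the truncation does not affect the comparison at double precision.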

Second, we also recover results from [9], where the authors consider a
non-stationary case, but restrict their analysis to Gaussian distributions:
$V(x) = \frac{x^2}{2}$. In this case, the function $V''$ is constant equal to 1 and, for $X_t$
solution to (2.3), one obtains that

$$\frac{d}{dt}E[X_t^2] = \Gamma(E[X_t^2], 1) - 2E[X_t^2]\,\mathcal{G}(E[X_t^2], 1)$$

$$= \ell^2\,\Phi\left(-\frac{\ell}{2\sqrt{E[X_t^2]}}\right) + \ell^2\,(1 - 2E[X_t^2])\,e^{\frac{\ell^2(E[X_t^2]-1)}{2}}\,\Phi\left(\ell\left(\frac{1}{2\sqrt{E[X_t^2]}} - \sqrt{E[X_t^2]}\right)\right).$$

This is indeed the ordinary differential equation satisfied by the
deterministic function obtained as the limit (when $n \to \infty$) of the processes
$\left(\frac{1}{n}\sum_{i=1}^n (X^{i,n}_{\lfloor nt \rfloor})^2\right)_{t \ge 0}$ in [9, Theorem 1]. More precisely, the proof of our
Theorem 1 and especially the estimate (3.24) below ensure that for any
$t \ge 0$, $\lim_{n \to \infty} E\left| \frac{1}{n}\sum_{i=1}^n (X^{i,n}_{\lfloor nt \rfloor})^2 - E[X_t^2] \right| = 0$ (where $X_t$ solves (2.3)).
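
For the Gaussian target, the evolution of the second moment is thus a closed scalar ODE, which can be integrated directly. The sketch below uses an explicit Euler scheme under our reading of (1.6) and (1.7) with $b = 1$; the names `rhs` and `second_moment`, the step size and the requirement $m_0 > 0$ are assumptions of this illustration.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def rhs(m, ell):
    """Right-hand side Gamma(m, 1) - 2 m G(m, 1) of the ODE, for m > 0."""
    r = math.sqrt(m)
    tail = math.exp(ell ** 2 * (m - 1.0) / 2.0) * Phi(ell * (1.0 / (2.0 * r) - r))
    return ell ** 2 * (Phi(-ell / (2.0 * r)) + (1.0 - 2.0 * m) * tail)


def second_moment(m0, ell, T, dt=1e-3):
    """Explicit Euler integration of d/dt E[X_t^2] from E[X_0^2] = m0 > 0."""
    m = m0
    for _ in range(int(round(T / dt))):
        m += dt * rhs(m, ell)
    return m


if __name__ == "__main__":
    for m0 in (0.05, 4.0):  # hypothetical starting points below and above 1
        print(m0, [round(second_moment(m0, 1.0, T), 4) for T in (1.0, 5.0, 40.0)])
```

The value $m = 1$ is a fixed point for every $\ell$, since both $\Phi$ terms then coincide; trajectories started below or above it relax towards this stationary second moment.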



2.3. A formal derivation. Before going into the details of a rigorous 
proof, let us explain how this limit diffusion process can be formally de- 
rived. 

First, let us make precise how to choose the scaling of $\sigma_n^2$ as a function
of $n$. The idea (see [21]) is to choose $\sigma_n$ in such a way that the limiting
acceptance rate (when $n \to \infty$) is neither zero nor one. In the first case,
this would mean that the variance of the proposal is too large, so that all
proposed moves are rejected. In the second case, the variance of the proposal
is too small, and the rate of convergence to equilibrium is thus not optimal.
In particular, it is easy to check that $\sigma_n$ should go to zero as $n$ goes to
infinity. Now, notice that the limiting acceptance rate is:



= E i^e'^UiV'ixrWGi^,+V''ixr)i) A 1 1 + 0{nal) 



(2.7) =^T{an,hn) + 0{nal), 

where $a_n = \frac{\sigma_n^2}{\ell^2}\sum_{i=1}^n (V'(X_k^{i,n}))^2$ and $b_n = \frac{\sigma_n^2}{\ell^2}\sum_{i=1}^n V''(X_k^{i,n})$. To obtain (2.7),
we used an explicit computation of the expectation with respect to the Gaussian
measure, see (A.5) below (with $\alpha = 0$). From this expression, assuming
a propagation of chaos (law of large numbers) result on the random variables
$(X_k^{i,n})_{1 \le i \le n}$, one can check that the correct scaling for the variance is
$\sigma_n^2 = \frac{\ell^2}{n}$ in order to obtain a non-trivial limiting acceptance rate (see
Proposition 1 above). More precisely, if $a_n \to 0$ and $b_n \to 0$, then the acceptance
rate goes to 1 (by continuity of $\Gamma$ at point (0,0), see Lemma 2 below). If
$a_n \sim \alpha n^{\epsilon}$ and $b_n \sim \beta n^{\epsilon}$ (for some $\epsilon > 0$), then the acceptance rate goes to
0 if $\beta > 0$ and to 1 if $\beta < 0$.
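
The trichotomy described above can be illustrated in the simplest setting. Assuming stationarity and a law of large numbers, so that both empirical sums are close to $n\sigma_n^2 I/\ell^2$, the acceptance rate is approximately $2\Phi(-\sigma_n\sqrt{nI}/2)$. The sketch below evaluates this approximation for $\sigma_n^2 = n^{-\alpha}$; the function name and the chosen exponents are assumptions made for the example.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def stationary_acceptance(n, alpha, I=1.0):
    """Approximate acceptance rate 2 Phi(-sigma_n sqrt(n I) / 2), sigma_n^2 = n^(-alpha)."""
    sigma2 = float(n) ** (-alpha)
    return 2.0 * Phi(-math.sqrt(n * sigma2 * I) / 2.0)


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 1.5):
        rates = [stationary_acceptance(n, alpha) for n in (10, 10 ** 3, 10 ** 6)]
        print("alpha =", alpha, "rates =", rates)
```

Only $\alpha = 1$ gives a limit strictly between 0 and 1: a slower decay of the variance drives the rate to 0, and a faster decay drives it to 1.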
Using the scaling $\sigma_n^2 = \frac{\ell^2}{n}$, we observe that, for a test function $\varphi$,



(2.8) 
We compute: 



-l/2^ 



(2.9) 



where 



1 " 



n ■' — ' 

1=1 



denotes the empirical distribution associated to the interacting particle sys- 
tem. The equation (2.9) is a consequence of (A. 3) below. A more detailed 
analysis (see Proposition 7 below) shows that the remainder is of order 
©(n-^/^). This is one of the most crucial estimate to prove rigorously the 
convergence result. For the diffusion term, we get 



1 ^2 y:7=iiv{xr)-vix--+^Gi^,)) 



A 1 



k 



(2.10) 



(n 



To obtain (2.10), we again used an explicit computation, see (A. 5) below. 

By plugging (2.9) (with the remainder of order 0{n~^^^)) and (2.10) 
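into (2.8), we see that the correct scaling in time is to consider $Y^{i,n}$ such
that $Y^{i,n}_{k/n} = X^{i,n}_k$, and we get:

$$E\big[\varphi(Y^{1,n}_{(k+1)/n}) \,\big|\, \mathcal{F}^n_k\big] = \varphi(Y^{1,n}_{k/n}) - \frac{1}{n}\,\varphi'(Y^{1,n}_{k/n})\,V'(Y^{1,n}_{k/n})\,\mathcal{G}\big(\langle \mu^n_{k/n}, (V')^2 \rangle, \langle \mu^n_{k/n}, V'' \rangle\big)$$
$$\qquad + \frac{1}{2n}\,\varphi''(Y^{1,n}_{k/n})\,\Gamma\big(\langle \mu^n_{k/n}, (V')^2 \rangle, \langle \mu^n_{k/n}, V'' \rangle\big) + \mathcal{O}(n^{-3/2})$$
$$= \varphi(Y^{1,n}_{k/n}) + \frac{1}{n}\,L_{\mu^n_{k/n}}\varphi(Y^{1,n}_{k/n}) + \mathcal{O}(n^{-3/2}),$$

where $L$ is defined by (2.4), and $\mu^n_t$ denotes the time-marginal of $\mu^n$ defined
by (3.1) (for every integer $k \ge 0$, $\mu^n_{k/n} = \nu^n_k$). This can be seen as a discrete-in-time version
(over a timestep of size $1/n$) of the martingale property (2.5) (which is
actually a characterization in law of a solution to (2.3), as explained below).
Thus, by sending $n$ to infinity and assuming that a law of large numbers
holds for the empirical measure $\nu^n_k$, we expect $Y^{1,n}$ to converge to a solution
to (2.3). The aim of Section 3 is to give a proof of this formal derivation.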

3. Proof of the main result. The next sections are devoted to the
proof of Theorem 1, which is divided into three steps. In Section 3.1, we first
introduce a weak formulation of (2.3) (namely the associated martingale
problem) and we prove existence and uniqueness for this problem. Then, in
Section 3.2, we check the tightness of the sequence of laws of the processes
$(Y^{1,n}_t)_{t \ge 0}$. According to [23], this is indeed equivalent to the tightness of the
sequence $(\pi^n)_n$ of the laws of the empirical measures

(3.1) $$\mu^n = \frac{1}{n}\sum_{i=1}^n \delta_{Y^{i,n}}$$

considered as random variables valued in the space $\mathcal{P}(\mathcal{C})$ of probability
measures on the set $\mathcal{C}$ of continuous paths from $[0, +\infty)$ to $\mathbb{R}$. The space $\mathcal{C}$ is
endowed with the topology of uniform convergence on compact sets and
$\mathcal{P}(\mathcal{C})$ with the corresponding topology for convergence in distribution. The
third and last step consists in checking that the limit of any convergent
subsequence of $(\pi^n)_n$ is concentrated on the solutions of the martingale problem.
This is done in Section 3.3 which concludes the proof of Theorem 1. Finally,
Section 3.4 is devoted to the proof of Proposition 1.

The proofs rely on the following Lemma which gives some basic properties
of the functions $\Gamma$ and $\mathcal{G}$.
Lemma 2. The function $\Gamma$ is continuous on $[0, +\infty] \times \mathbb{R}$ and such that

(3.2) $$\inf_{(a,b) \in [0,+\infty] \times [\inf V'', \sup V'']} \Gamma(a,b) > 0,$$

(3.3) $$\exists C < +\infty,\ \forall (a,b) \text{ and } (a',b') \in [0, +\infty] \times \mathbb{R},\quad
|\Gamma(a,b) - \Gamma(a',b')| \le C\left(|b' - b| + |a' - a| + |\sqrt{a'} - \sqrt{a}|\right).$$
The function Q is continuous on {[0, +oo] x M} \ {(0, 0)} and such that 

(3.4) V(a, h) G [0, +oo] x M, V«^(a, &) < V , 

(3.5) 3C < +00, V(a, 6) and (a , 6') G [0, +oo] x [inf V" , sup V"], 
{^^^/d)\G{a,b)-G{a\b')\ 

< C (\b' -b\ + \a' -a\ + iVa' - ^/a\ 



Last, for all $(a,b) \in [0, +\infty] \times \mathbb{R}$, $0 \le \mathcal{G}(a,b) \le \Gamma(a,b) \le \ell^2$.

Notice that $\mathcal{G}$ is indeed discontinuous at point (0,0) since $\lim_{b \to 0^+} \mathcal{G}(0,b) \neq
\mathcal{G}(0,0)$. The proof of this Lemma is given in Appendix A.

3.1. The martingale problem. 

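Definition 2. Let $(Y_t)_{t \ge 0}$ denote the canonical process on $\mathcal{C}$. A probability
measure $P \in \mathcal{P}(\mathcal{C})$ with time-marginals $(P_t)_{t \ge 0}$ solves the nonlinear
martingale problem (MP) if $P_0 = m$ and

$$\forall \varphi \in C_c^\infty(\mathbb{R}),\quad M_t^\varphi \overset{\mathrm{def}}{=} \varphi(Y_t) - \int_0^t L_{P_s}\varphi(Y_s)\,ds \ \text{ is a $P$-martingale.}$$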

This martingale problem is the weak formulation of the nonlinear stochastic
differential equation (2.3). Indeed the law of any solution of (2.3) solves
(MP). Conversely, when $P$ solves (MP), one easily checks by Paul Lévy's
characterization (see [13, Theorem 3.16 p. 157]) that

$$\left( \beta_t = \int_0^t \frac{dY_s + \mathcal{G}\big(\langle P_s, (V')^2 \rangle, \langle P_s, V'' \rangle\big)\,V'(Y_s)\,ds}{\Gamma^{1/2}\big(\langle P_s, (V')^2 \rangle, \langle P_s, V'' \rangle\big)} \right)_{t \ge 0}$$

is a $P$-Brownian motion. Thus, this implies the existence of a weak solution
with law $P$ for the stochastic differential equation

(3.6) $$X^P_t = \xi + \int_0^t \Gamma^{1/2}\big(\langle P_s, (V')^2 \rangle, \langle P_s, V'' \rangle\big)\,dB_s - \int_0^t \mathcal{G}\big(\langle P_s, (V')^2 \rangle, \langle P_s, V'' \rangle\big)\,V'(X^P_s)\,ds.$$

For fixed time-dependent coefficients $\Gamma^{1/2}(\langle P_s, (V')^2 \rangle, \langle P_s, V'' \rangle)$ and
$\mathcal{G}(\langle P_s, (V')^2 \rangle, \langle P_s, V'' \rangle)$, by boundedness of $\mathcal{G}$ on
$[0, +\infty] \times [\inf V'', \sup V'']$ (see Lemma 2 above) and
Lipschitz continuity of $V'$, it is standard to check that trajectorial uniqueness
holds for this (linear in the sense of McKean) stochastic differential
equation. As a consequence, by the Yamada-Watanabe theorem (see [13,
Proposition 3.20 p. 309, Corollary 3.23 p. 310]), this linear stochastic
differential equation admits a unique strong solution and the law of this solution
is $P$. In conclusion, one may associate a strong solution to (2.3) with law $P$,
to any solution $P$ of the nonlinear martingale problem (MP).

Notice that the two next sections will ensure existence for (MP) and
(2.3). Uniqueness is ensured by the following proposition.

Proposition 2. For any probability measure $m$ on $\mathbb{R}$, uniqueness holds
for the nonlinear martingale problem (MP) and trajectorial uniqueness holds
for the stochastic differential equation (2.3).

Combined with the results of the two next sections, this ensures that
$(\pi^n)_n$ converges weakly to $\delta_P$ where $P$ denotes the unique solution of (MP).
According to [23] this is equivalent to the $P$-chaoticity of the processes
$((Y^{1,n}_t, \dots, Y^{n,n}_t)_{t \ge 0})_{n \ge 1}$.

To prove Proposition 2, we need the following technical Lemma. 
Lemma 3. For any solution {Xt)t>o of (2.3), 

VO < s < t, E[{Xt - X,)2] < 2f {t-s)+ (^f supCl/")^ ^ ^) " 

Moreover, if $\langle m, (V')^2 \rangle < +\infty$, then $t \mapsto E[(V'(X_t))^2]$ is locally bounded. If
$\langle m, (V')^2 \rangle = +\infty$, then $\forall t \ge 0$, $E[(V'(X_t))^2] = +\infty$.

Proof. Let {Xt)t>o solve (2.3). Then for < s < t, 



E[{Xt - Xs)'] <2E 



T'/\n{V\Xr))%E[V"{Xr)])dB, 



+ 2(i-s) / g\E[{V'{Xr))%E[V"{Xr)m[{V'{Xr))^]dr 



{t - s)' 



<2/2(t - s) + 2 (^/Sup(0+ V ^) 



where we used the boundedness properties of F and y/aQ{a,b) stated in 
Lemma 2. 






B. JOURDAIN, T. LELIEVRE AND B. MIASOJEDOW 



One easily deduces the properties of $t \mapsto E[(V'(X_t))^2]$ since




with $\xi$ distributed according to $m$.



□ 



We are now in position to prove Proposition 2. 

Proof of Proposition 2. By the discussion following Definition 2, we
know that, for a given Brownian motion $B_t$ and initial condition $\xi$, one may
associate a strong solution to (2.3) with law $P$ to any solution $P$ of the
nonlinear martingale problem (MP). Therefore, to get uniqueness of solutions
to (MP), it is enough to prove trajectorial uniqueness for (2.3). Let $(X_t)_{t \ge 0}$
and $(\tilde{X}_t)_{t \ge 0}$ denote two solutions of this nonlinear stochastic differential
equation, with the same initial condition, and driven by the same Brownian
motion. If $\langle m, (V')^2 \rangle = +\infty$, then by Lemma 3 and since $\Gamma(\infty, b) = \frac{\ell^2}{2}$ and
$\mathcal{G}(\infty, b) = 0$, these two processes are equal to $\left(X_0 + \frac{\ell}{\sqrt{2}} B_t\right)_{t \ge 0}$. This proves
trajectorial uniqueness.

Let us now assume that $\langle m, (V')^2 \rangle < +\infty$. By Lemma 3,
$t \mapsto E[(X_t - \tilde{X}_t)^2] = E[(X_t - X_0 - (\tilde{X}_t - \tilde{X}_0))^2]$ and
$t \mapsto E[(V'(X_t))^2] \vee E[(V'(\tilde{X}_t))^2]$ are
locally bounded. In order to simplify the notation, let us denote

$$\Gamma_s = \Gamma\big(E[(V'(X_s))^2], E[V''(X_s)]\big), \quad \tilde{\Gamma}_s = \Gamma\big(E[(V'(\tilde{X}_s))^2], E[V''(\tilde{X}_s)]\big)$$

and

$$\mathcal{G}_s = \mathcal{G}\big(E[(V'(X_s))^2], E[V''(X_s)]\big), \quad \tilde{\mathcal{G}}_s = \mathcal{G}\big(E[(V'(\tilde{X}_s))^2], E[V''(\tilde{X}_s)]\big).$$

Computing $(X_t - \tilde{X}_t)^2$ by Itô's formula and taking expectations, then using
(3.2) and the boundedness of $\mathcal{G}$ and $V''$, last using (3.3) and (3.5) and
Young's inequality, one obtains



E[{Xt - Xt 

+ 2E 
1 

< 



{Qs V\Xs) - Gs V\Xs)) {Xs - Xs)ds 



4infr 



+ 2 / \Q. 



Ts-VA ds + 2f\\V"' 



A 1/2 
1 1 1 



n{V\Xs)f] A niy'{Xs)f]) ^ " W}/\Xs - Xsf]ds 

< C I E[{Xs - Xsf] + K^[V"{X,) - V"{Xs)] 
Jo 

+ E\V\X,)f - {V'{X,)f] + (Ei/2[(y'(x,))2] - Ei/2[(v'(x,))2] 

The first inequality is based on the fact that
(Gs V'{X,) - Gs V\X,)){X, - Xs) 



E 



g,E 



(y'(x,)-v'(x,))(x,-x,) +(g,-g,)E v'(Xs)(x,-x, 



< f\\V"\\ooE[{Xs-Xsf] + \Gs-Gs\^^^^ [{V'{Xs)f] eV2 ^(^Xs-Xsf 

and the similar inequality obtained by exchanging $X$ and $\tilde{X}$. Now, since,

\E[v"{Xs) - v"{Xs)]\ < ||y(=^)||ooiE'/'[(^. - x,)% 

n{V'{X,)f - {V'{Xs)f]\ < \\V"\\^E'/\Xs - Xsf] 

X (E'/\V'{Xs)f]+E^/\V'{Xs)f] 



\E'/^[{V\Xs)f]-E^/\V'{Xs)f]\ < E^/\V'{Xs)-V'{Xs)f] 

<\\V"\UEy\X,-Xs)% 

the local boundedness of $t \mapsto E[(V'(X_t))^2] \vee E[(V'(\tilde{X}_t))^2]$, the local
integrability of $t \mapsto E[(X_t - \tilde{X}_t)^2]$ and Gronwall's Lemma ensure that $\forall t \ge 0$,
$E[(X_t - \tilde{X}_t)^2] = 0$. □

Remark 1. When (m, {V')'^) = +oo, we have already shown uniqueness 
of solutions to (2.3), and it is actually easy to build a strong solution. Indeed, 
since 

2 
one has $E\left[\left(V'\left(\xi + \frac{\ell}{\sqrt{2}} B_t\right)\right)^2\right] = +\infty$ for all $t \ge 0$. As a consequence
$\left(\xi + \frac{\ell}{\sqrt{2}} B_t\right)_{t \ge 0}$ solves (2.3).

3.2. Tightness of the sequence $(\pi^n)_{n \ge 1}$. According to [23], because of
exchangeability, the tightness of the sequence $(\pi^n)_n$ is equivalent to the
tightness of the laws of the processes $(Y^{1,n}_t)_{t \ge 0}$. Under the hypothesis of
Theorem 1, the laws of the random variables $(X_0^{1,n})_{n \ge 1}$ are tight since they
converge weakly to $m$ as $n \to \infty$. Hence the tightness is a consequence of
the following proposition.

Proposition 3. Assume that the laws of the random variables $(X_0^{1,n})_{n \ge 1}$
are tight. Then the laws of the linearly interpolated processes
$(Y^{1,n}_t = (\lceil nt \rceil - nt)\,X^{1,n}_{\lfloor nt \rfloor} + (nt - \lfloor nt \rfloor)\,X^{1,n}_{\lceil nt \rceil},\ t \ge 0)_{n \ge 1}$ are tight in $\mathcal{C}$. Moreover there
is a finite constant $C$ not depending on $n$ such that

(3.7) $$\forall 1 \le i \le n,\ \forall t \ge 0,\quad E[(V'(Y^{i,n}_t))^2] \le 2E[(V'(X^{i,n}_0))^2] + C(t \vee t^2).$$

The proof is based on the following estimate, that will be proven below. 

Lemma 4. There exists a deterministic constant C < +00 not depending 
on n such that 

VI < . < n, VO < / < A:, E ((X^ - X;'-)^|jr) < C V . 



Proof of Proposition 3. Since the laws of the initial random variables
$(X_0^{1,n})_{n \ge 1}$ are supposed to be tight, the Kolmogorov criterion ensures the
desired tightness property as soon as there exists a finite deterministic
constant $C$ such that

(3.8) $$\forall 0 \le s \le t,\quad E\big((Y^{1,n}_t - Y^{1,n}_s)^4 \,\big|\, \mathcal{F}^n_0\big) \le C\big((t - s)^2 \vee (t - s)^4\big).$$

For $t \ge s \ge 0$ with $\lfloor nt \rfloor \ge \lceil ns \rceil$, using Lemma 4 for the second inequality,
one obtains

Ef(y/'"-y,i'")4|j-o") 

■I' 



/(/(nt- lntJ)G| ,t)4 , , mns'\ - ns)G\ 
< 27E( ^ , ^"^1^ + (X - X )^ + ^-i^ 

— ^2 I I 



<C((t-s)2v(t-s 



i4n 
For $t \ge s \ge 0$ with $\lfloor ns \rfloor = \lfloor nt \rfloor$, one has $(nt - ns)^4 \le (nt - ns)^2$ and therefore



E((y/'"-y,i'")4|j-o" 



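Last, since $|V'(Y^{i,n}_t)| \le |V'(X^{i,n}_0)| + \|V''\|_\infty\,|Y^{i,n}_t - Y^{i,n}_0|$,

$$E[(V'(Y^{i,n}_t))^2] \le 2E[(V'(X^{i,n}_0))^2] + 2\|V''\|_\infty^2\,E^{1/2}[(Y^{i,n}_t - Y^{i,n}_0)^4]
\le 2E[(V'(X^{i,n}_0))^2] + C(t \vee t^2).$$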



□ 



The proof of Lemma 4 relies on the following Lemma.

Lemma 5. There exists a finite constant $C$ not depending on $n$ such that
$\forall (x_1, \dots, x_n) \in \mathbb{R}^n$,



E 



< 



c 



n 



Proof. By Lipschitz continuity of x i— )• A 1 and the Taylor expansion 

/ /—til 



with Xi G [xi,Xi + one obtains 



E 



< E 



2n 



Developing the square and remarking that for i 7^ j, E [((G*)^ — — 1) 

= E [((G*)^ - , one easily concludes using the bounded- 

ness of and □ 



We are now in position to prove Lemma 4. 
Proof of Lemma 4. For $k \in \mathbb{N}$ ($k \ge 1$), let us denote by $E_k$ the conditional
expectation given $\mathcal{F}^n_k$. For $\tilde{k} \in \mathbb{N}$, $\tilde{k} \ge k$, one has



fc+l<fcl,fc2,fe3,fc4<fc V=l 



3.9 = — ^ , + -J ri,i,i,i + r2,i,i + Ts,! + r2,2 + 

where the sum has been separated into five disjoint terms:

• $T_{1,1,1,1}$ corresponds to the restriction of the summation to indexes $k_1$,
$k_2$, $k_3$ and $k_4$ taking distinct values,

• $T_{2,1,1}$ to the restriction to indexes such that the cardinality of $\{k_1, k_2, k_3, k_4\}$
is equal to 3,

• $T_{3,1}$ to three indexes equal and the last one different,

• $T_{2,2}$ to two pairs of equal indexes taking different values,

• $T_4$ to four equal indexes.

One has 

T4 + r2,2 + n,i <(k- k)E{{G\)^) + 3(k - k)(k -k- l)E{{G\fiGlf) 
+ 4(k - k)(k - k- l)E{\G\f)E\Gl\ 
2 16(k-k)(k-k-l) 



(3.10) =3{k-ky + 

Let us now estimate Ti^i^i^i and T2^i.i. For fixed ki, k2, k^ and k^ (four inte- 
gers in {k+1, . . . , k}), let us define {X^/^^, k > l)i<i<n such that {X^'"', . . . , - 
(Xq'", . . . , Xq and, for 1 < i < n, 

V"- \Uk+i<e * ^ ^ ^ 

Let us also denote by J- the sigma-field generated by these processes which 
are exchangeable, independent of (C/^, {G\, . . . , G'^))k<^{kx,k2MM} ^^-"^ equal 
to the original processes {Xj^^,k > l)i<i<n on the event 
Of course, $\mathcal{F}^n_k \subset \mathcal{F}$. When the indexes $k_1$, $k_2$, $k_3$ and $k_4$ are distinct (namely
for $T_{1,1,1,1}$), by conditional independence of the vectors $((G^1_{k_j}, \dots, G^n_{k_j}, U_{k_j}))_{1 \le j \le 4}$
given $\mathcal{F}$, one has


By Cauchy-Schwarz inequality and Lemma 5 above, 



T 



1 _ e J:"=i(7^^'(^^;-i)G'.>||v"'{x;- J) 



J" 



E 



^1 / EIL,(nx-,)-m-,+^c^j)^^ 



— e 



A 1 



T 



(3.11) < ^eV2[(g1;2|^] < ^. 



Applying (A.4) with a = (X.^f. J|, /3 = Er=2(^0^(^If-i) and 

7 = — 1^ Y^=\ ^"(^fc"_i)i then using exchangeability, one obtains 



Ei 



< 



E Gi, 1-e 



J" 
With (3.11) we deduce that

C 



kj 



< 

'n 



which imphes that Ti^i^i^i < (^^4") • Concerning T2^i^i, we proceed similarly 
and bound the conditional expectation of the squared term by Efc((G^ ,)^) = 

1 to derive T2^i^i < ^('^3~)(2)- concludes by plugging these estimates 
together with (3.10) in (3.9). □ 

3.3. Identification of the limits of converging subsequences of $(\pi^n)_{n\ge 1}$. 
Let $\pi^\infty$ denote the limit of a converging subsequence of $(\pi^n)_n$ that we still 
index by $n$ for notational simplicity. We want to prove that $\pi^\infty$ gives full 
weight to the solutions of the nonlinear martingale problem (MP) (see 
Definition 2). To do so, for $\varphi \in C_c^\infty(\mathbb{R})$, $p \in \mathbb{N}$, $g : \mathbb{R}^p \to \mathbb{R}$ continuous and 
bounded and $0 \le s_1 \le s_2 \le \dots \le s_p \le s \le t$, we define 

$$F : Q \in \mathcal{P}(\mathcal{C}) \mapsto \Big\langle Q, \Big(\varphi(Y_t) - \varphi(Y_s) - \int_s^t L_{Q_r}\varphi(Y_r)\,dr\Big)\, g(Y_{s_1}, \dots, Y_{s_p})\Big\rangle.$$

To prove that $\pi^\infty$ gives full weight to the solutions of (MP), it is enough to 
check that $E^{\pi^\infty}|F(Q)| = 0$. Indeed, taking $\varphi, g, s_1, \dots, s_p, s, t$ in countable 
dense subsets, one deduces that $\pi^\infty$ a.s., $Q$ solves (MP). In Section 3.3.1, 
we present the main steps of the proof. Then, in Sections 3.3.2 and 3.3.3, 
we provide the proofs of crucial propositions used in Section 3.3.1. 

3.3.1. Proof of $E^{\pi^\infty}|F(Q)| = 0$. By combining the two next propositions, 
one first obtains the asymptotic behavior of $E^{\pi^n}|F(Q)| = E|F(\mu^n)|$ as 
$n \to \infty$. 



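Proposition 4. Assume the existence of $n_0$ finite such that 
$\sup_{n\ge n_0} E[(V'(X_0^{1,n}))^2] < +\infty$. Let 

$$\begin{aligned}
M_k^{i,n} &= \frac{\ell}{\sqrt n}\sum_{l=0}^{k-1}\varphi'(X_l^{i,n})\Big(G^i_{l+1}1_{A_{l+1}} - E\big[G^i_{l+1}1_{A_{l+1}}\,\big|\,\mathcal F_l\big]\Big) \\
&\quad + \frac{\ell^2}{2n}\sum_{l=0}^{k-1}\varphi''(X_l^{i,n})\Big((G^i_{l+1})^2 1_{A_{l+1}} - E\big[(G^i_{l+1})^2 1_{A_{l+1}}\,\big|\,\mathcal F_l\big]\Big).
\end{aligned}$$

Then, for $s \le t$, 

$$\lim_{n\to\infty}\sup_{1\le i\le n} E\Big|\varphi(Y_t^{i,n}) - \varphi(Y_s^{i,n}) - \int_s^t L_{\mu_r^n}\varphi(Y_r^{i,n})\,dr - \big(M^{i,n}_{\lceil nt\rceil} - M^{i,n}_{\lceil ns\rceil}\big)\Big| = 0,$$

where $\mu_r^n$ denotes the marginal at time $r$ of $\mu^n$ (defined by (3.1)). 

Proposition 5. Assume the existence of $n_0$ finite such that 
$\sup_{n\ge n_0} E[(V'(X_0^{1,n}))^2] < +\infty$. Then 

$$\lim_{n\to\infty} E\Big[\Big(\frac1n\sum_{i=1}^n \big(M^{i,n}_{\lceil nt\rceil} - M^{i,n}_{\lceil ns\rceil}\big)\, g(Y^{i,n}_{s_1},\dots,Y^{i,n}_{s_p})\Big)^2\Big] = 0.$$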
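Since $F(\mu^n) = \frac1n\sum_{i=1}^n\big(\varphi(Y_t^{i,n}) - \varphi(Y_s^{i,n}) - \int_s^t L_{\mu_r^n}\varphi(Y_r^{i,n})\,dr\big)\, g(Y^{i,n}_{s_1},\dots,Y^{i,n}_{s_p})$, 
one has 

$$\begin{aligned}
E|F(\mu^n)| &\le \frac{\|g\|_\infty}{n}\sum_{i=1}^n E\Big|\varphi(Y_t^{i,n}) - \varphi(Y_s^{i,n}) - \int_s^t L_{\mu_r^n}\varphi(Y_r^{i,n})\,dr - \big(M^{i,n}_{\lceil nt\rceil} - M^{i,n}_{\lceil ns\rceil}\big)\Big| \\
&\quad + E^{1/2}\Big[\Big(\frac1n\sum_{i=1}^n \big(M^{i,n}_{\lceil nt\rceil} - M^{i,n}_{\lceil ns\rceil}\big)\, g(Y^{i,n}_{s_1},\dots,Y^{i,n}_{s_p})\Big)^2\Big].
\end{aligned}$$

One deduces that 

$$\lim_{n\to\infty} E^{\pi^n}|F(Q)| = 0. \tag{3.12}$$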

Since $\mathcal G$, $\varphi$, $\Gamma$ and $V'\varphi'$ are bounded, the function $F$ is bounded. 
Unfortunately, when $V'$ is not bounded, the lack of continuity of 
$\mu \in \mathcal P(\mathbb R) \mapsto \langle\mu, (V')^2\rangle$ implies that $F$ is not continuous and the weak 
convergence of $\pi^n$ to $\pi^\infty$ does not directly ensure that $E^{\pi^\infty}|F(Q)| = 0$. 

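To overcome this difficulty, for $k \in \mathbb N$, we introduce the second order 
differential operator $L^k_\mu$ defined like $L_\mu$ in (2.4) but with $\langle\mu, (V')^2 \wedge k\rangle$ 
replacing $\langle\mu, (V')^2\rangle$. We also define $F_k$ like $F$ but with $L_{Q_r}$ replaced by $L^k_{Q_r}$. 
The functions $F_k$ are uniformly bounded and converge pointwise to $F$ by 
the properties of $\mathcal G$ and $\Gamma$ stated in Lemma 2. Moreover, $F_k$ is continuous. 
Indeed, to deal with the discontinuity of $\mathcal G$ at $(0,0)$, it is enough to remark 
that for $\nu, \mu \in \mathcal P(\mathbb R)$, 

$$\begin{aligned}
&\big\langle\nu, \big|\mathcal G(\langle\nu,(V')^2\wedge k\rangle,\langle\nu,V''\rangle) - \mathcal G(\langle\mu,(V')^2\wedge k\rangle,\langle\mu,V''\rangle)\big| \times |V'\varphi'|\big\rangle \\
&\quad\le 1_{\{\langle\mu,(V')^2\wedge k\rangle>0\}}\,\|V'\varphi'\|_\infty\,\big|\mathcal G(\langle\nu,(V')^2\wedge k\rangle,\langle\nu,V''\rangle) - \mathcal G(\langle\mu,(V')^2\wedge k\rangle,\langle\mu,V''\rangle)\big| \\
&\qquad + 1_{\{\langle\mu,(V')^2\wedge k\rangle=0\}}\,\ell^2\,\langle\nu - \mu, |V'\varphi'|\rangle,
\end{aligned}$$

where we used in the last line the fact that $1_{\{\langle\mu,(V')^2\wedge k\rangle=0\}}\langle\mu, |V'\varphi'|\rangle = 0$. 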
As a consequence, 

$$\begin{aligned}
E^{\pi^\infty}|F(Q)| = \lim_{k\to\infty} E^{\pi^\infty}|F_k(Q)| &= \lim_{k\to\infty}\lim_{n\to\infty} E^{\pi^n}|F_k(Q)| \\
&\le \limsup_{k\to\infty}\limsup_{n\to\infty} E^{\pi^n}|F_k(Q) - F(Q)|,
\end{aligned}$$

where we used (3.12) for the inequality. One concludes by the next 
proposition: 

Proposition 6. Assume the existence of $n_0$ finite such that the random 
variables $((V'(X_0^{1,n}))^2)_{n\ge n_0}$ are uniformly integrable. Then 

$$\lim_{k\to\infty}\sup_{n\ge n_0} E|F_k(\mu^n) - F(\mu^n)| = 0.$$

This Proposition concludes the identification of the limit as the solution 
to (MP), and thus concludes the proof of Theorem 1. Indeed, under the 
assumptions of Theorem 1, there exists $n_0$ finite such that 
$\sup_{n\ge n_0} E[(V'(X_0^{1,n}))^2] < +\infty$. Moreover, the convergence in distribution of the 
non-negative random variables $((V'(X_0^{1,n}))^2)_{n\ge n_0}$ to $m\circ((V')^2)^{-1}$ combined 
with the convergence of their expectation to the one associated with the 
limiting distribution ensures that they are uniformly integrable. 

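3.3.2. Proof of Proposition 4. This Section is devoted to the proof of 
Proposition 4. As already pointed out in Section 2.3, the main difficulty is 
the identification of the drift term. 

Proof of Proposition 4. One has $dY_t^{i,n} = \ell\sqrt n\,G^i_{\lceil nt\rceil}1_{A_{\lceil nt\rceil}}\,dt$. As a 
consequence, 

$$\varphi(Y_t^{i,n}) - \varphi(Y_s^{i,n}) = \int_s^t \ell\sqrt n\,\varphi'(Y_r^{i,n})\,G^i_{\lceil nr\rceil}1_{A_{\lceil nr\rceil}}\,dr.$$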

Using the Taylor expansion 

^'(Yn = ^'(XlZ^) + /(X;;",j)(nr - [nr\)±G}^^^U^^^^ 



with Xr" G [^[^j 5 Yr'"^] , one deduces that 

\>^^){^r^){nr-[nr\f{G\^,.^fU^^^^dr 



JL 

l'^{ns — Yns\){\ns'\ — us) „ 
2n 

f'{nt - [ntj)([nt] - nt) „ 



+ 



2n 
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By the boundedness of and (p^^^ , one easily concludes that 
(3.13) 
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E 



, dr 



< 



C 



n 



Combining the error bound in Lemma 5 with (A. 5), it will be easy to identify 
the diffusion coefficient from the term E V^u)"{xi''' ,){G\ . )'^1a, , IJ'I' 

as explained below. To identify the drift coefficient from E ly/nip' {X^^^^^ )G}nr] ^-^inr] \-^nr\ 
and (A. 3), we need a more precise estimate. 



Proposition 7. Let $x = (x_1, \dots, x_n) \in \mathbb R^n$ and $\nu_n = \frac1n\sum_{i=1}^n \delta_{x_i}$. There 
exists a finite constant $C$ not depending on $n$ and $x$ such that 

E ( fe^"^i(^(-)-^(-+;^«')) A llU {Vr)A^n, V")) 



< C 



l + \V'{x{)\ , \V'{x,)\ 



+ 



+ 



\V'{x^)\'l'' 



n 



n3/4(i.„,(y')2)V4 n3/4^(z.„,(y')2) / • 



Let us admit this Proposition, and conclude the proof of Proposition 4. 



(3.14) 

rpi,n . rpi,n , rpi,n . rpi,n rpi,n 

where 



rpl,n 



, dr 



L,Myr)dr-{M^Z^-M\Z^ 



\ n n 



rpl,n 



L^n^(y;'")-L^.^^j(^(y^)dr, 



n 



/2([nt] - nt) „ 



2n 
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and 



+ 



n 



l'^{\ns\ - ns) „ i. 



2n 



c 



n 



The boundedness of 99' and y?" implies that 
(3.15) E(|T]'"| + iTg^'"!) < 

By Proposition 7, Holder's inequality and the equality 



deduced from exchangeability, one obtains 
(3.17) 

^t i + Einyff^^jl E3/4(|y^(yg )p/3) EV^|y-(i^-^j| 

< C / ^-^^ — 1 , 1 — dr 



n 



n 



1/4 



1/4 



Concerning Tg'", by Cauchy-Schwarz inequality and Lemma 5, one easily 
checks that 



< 



c 



n 



where, by (A. 5) and (A. 6), 



E 



l{i=i} 



^r((^;f^,(y')'),(/iTnH,n) 



(3.18) 



+ 



y'(xf'",)y'(x/'",) , 



n 



1/ ,v">)2 

J \7ir\ j ' ' 

' 2(^7- I , .(V')2> 
' [nrj/n ^ ' 
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(We will need this expression for i ^ j below). With the boundedness of Q 
and (3.16), this implies that 

(3.19) Eirri < -^ + - tniy\Yl':^,)f]+E'/\V\YlC,)f]dr. 

To deal with Tg'", one remarks that by exchangeability, boundedness of 
g, ip' and (1^V')'> then by (3.5) 



E 



By exchangeability E|«-Mf„,,j/„, r') I < ||^E|y,*'" - y^^j/J. More- 
over, llC'" — Y,*'"", , I < -^IG} n |. Dealing in the same way with the diffusion 
term by boundedness of F and ip^^^ and (3.3), one deduces that 

(3.20) 



Jn 

One has 



EK/i;^-/^f„.j/„,(n'>i 

< ^2\\v"UY}i\Y\Yr)f + (y'(y^j/J)^]Ei/2[(y;'" - y^^^j 

(3.21) 



< -^EV2[(y'(y;>n))2 + (^'(y^j/J)^]. 



Plugging this inequality in (3.20) and inserting the resulting inequality 
together with (3.15), (3.17) and (3.19) into (3.14), one concludes with (3.13) 
and local boundedness of $r \mapsto \sup_{n\ge n_0} E[(V'(Y_r^{1,n}))^2]$ deduced from (3.7) 
and the assumption $\sup_{n\ge n_0} E[(V'(X_0^{1,n}))^2] < +\infty$. □ 
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To prove Proposition 7, we need the following lemma : 

Lemma 6. Let $X$, $Y$ denote two real random variables with respective 
cumulative distribution functions $F_X$ and $F_Y$ and $f : \mathbb R \to \mathbb R$ be a bounded 
function, Lipschitz continuous with constant $L(f)$ outside $[-\varepsilon, \varepsilon]$ for some 
constant $\varepsilon > 0$. If $X$ admits a bounded density $p_X$ with respect to the Lebesgue 
measure on $\mathbb R$, then 

$$|E[f(X)] - E[f(Y)]| \le L(f)\,W_1(X,Y) + 2(\sup f - \inf f)\Big(\sqrt{2\|p_X\|_\infty W_1(X,Y)} + \|p_X\|_\infty\,\varepsilon\Big),$$

where $W_1(X,Y) = \inf_{Z \overset{d}{=} X,\; W \overset{d}{=} Y} E|Z - W|$ denotes the Wasserstein 
distance between the laws of $X$ and $Y$. 

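Proof. Let for $u \in (0,1)$, $F_X^{-1}(u) = \inf\{x \in \mathbb R : F_X(x) \ge u\}$ denote 
the càg pseudo-inverse of $F_X$ and $F_Y^{-1}$ be defined in the same way. Then, 
$\forall x \in \mathbb R$, $\forall u \in (0,1)$, $F_X^{-1}(u) \le x \Leftrightarrow u \le F_X(x)$. Moreover, if $U$ is uniformly 
distributed on $[0,1]$, then $F_X^{-1}(U) \overset{d}{=} X$, $F_Y^{-1}(U) \overset{d}{=} Y$ and according to [19, 
p. 107-109], $W_1(X,Y) = E|F_X^{-1}(U) - F_Y^{-1}(U)|$. As a consequence, 

$$\begin{aligned}
|E[f(X)] - E[f(Y)]| &= \big|E\big[f(F_X^{-1}(U)) - f(F_Y^{-1}(U))\big]\big| \\
&\le \Big|E\Big[\big(f(F_X^{-1}(U)) - f(F_Y^{-1}(U))\big)\big(1_{\{F_X^{-1}(U)\vee F_Y^{-1}(U) \le -\varepsilon\}} + 1_{\{F_X^{-1}(U)\wedge F_Y^{-1}(U) > \varepsilon\}}\big)\Big]\Big| \\
&\quad + \Big|E\Big[\big(f(F_X^{-1}(U)) - f(F_Y^{-1}(U))\big)\big(1_{\{F_X^{-1}(U) \le -\varepsilon < F_Y^{-1}(U)\}} + 1_{\{F_X^{-1}(U) > \varepsilon \ge F_Y^{-1}(U)\}}\big)\Big]\Big| \\
&\quad + \Big|E\Big[\big(f(F_X^{-1}(U)) - f(F_Y^{-1}(U))\big)1_{\{-\varepsilon < F_X^{-1}(U) \le \varepsilon\}}\Big]\Big| \\
&\le L(f)\,E|F_X^{-1}(U) - F_Y^{-1}(U)| \\
&\quad + (\sup f - \inf f)\Big(P\big(F_Y(-\varepsilon) < U \le F_X(-\varepsilon)\big) + P\big(F_X(\varepsilon) < U \le F_Y(\varepsilon)\big) + P\big(F_X(-\varepsilon) < U \le F_X(\varepsilon)\big)\Big) \\
&= L(f)\,W_1(X,Y) \\
&\quad + (\sup f - \inf f)\Big(\big(F_X(-\varepsilon) - F_Y(-\varepsilon)\big)^+ + \big(F_Y(\varepsilon) - F_X(\varepsilon)\big)^+ + \int_{-\varepsilon}^{\varepsilon} p_X(x)\,dx\Big).
\end{aligned}$$

One concludes by using the inequality 

$$\sup_{x\in\mathbb R}|F_X(x) - F_Y(x)| \le \sqrt{2\|p_X\|_\infty W_1(X,Y)}.$$

This inequality is stated in [14, Lemma 5.4] with the factor 2 replaced by 4 
but a careful look at the proof of this lemma shows that it holds with the 
factor 2. □ 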
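The quantile coupling used in the proof of Lemma 6 lends itself to a quick numerical illustration. The sketch below is only an illustration under stated assumptions: the two Gaussian laws, the shift 0.3, the grids and the helper names `w1_quantile` and `sup_cdf_gap` are illustrative choices, not taken from the paper. It approximates $W_1(X,Y) = E|F_X^{-1}(U) - F_Y^{-1}(U)|$ on a midpoint grid of $(0,1)$ and compares $\sup_x |F_X(x) - F_Y(x)|$ with $\sqrt{2\|p_X\|_\infty W_1(X,Y)}$.

```python
import math
from statistics import NormalDist

def w1_quantile(dist_x, dist_y, n_grid=20000):
    """Approximate E|F_X^{-1}(U) - F_Y^{-1}(U)| on a midpoint grid of (0, 1)."""
    total = 0.0
    for j in range(n_grid):
        u = (j + 0.5) / n_grid
        total += abs(dist_x.inv_cdf(u) - dist_y.inv_cdf(u))
    return total / n_grid

def sup_cdf_gap(dist_x, dist_y, lo=-10.0, hi=10.0, n_grid=20001):
    """Approximate sup_x |F_X(x) - F_Y(x)| on a uniform grid of [lo, hi]."""
    step = (hi - lo) / (n_grid - 1)
    return max(abs(dist_x.cdf(lo + j * step) - dist_y.cdf(lo + j * step))
               for j in range(n_grid))

X = NormalDist(0.0, 1.0)   # its density is bounded by 1/sqrt(2*pi)
Y = NormalDist(0.3, 1.0)   # illustrative shift, not from the paper
w1 = w1_quantile(X, Y)
gap = sup_cdf_gap(X, Y)
density_sup = 1.0 / math.sqrt(2.0 * math.pi)
print(w1, gap, math.sqrt(2.0 * density_sup * w1))
```

For two Gaussian laws differing by a shift, the quantile coupling moves every quantile by the same amount, so the computed $W_1$ is the size of the shift.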






Proof of Proposition 7. The proof is inspired from [14, Section 5], 
where the authors first replace V{xi) — V{xi + -^G^)) by —-^G^ in the ex- 
ponential factor at a cost C(^)- Then they explicitly compute the conditional 
expectation given (G^, . . . , G") to improve the regularity of the function in 
the expectation. Next they replace Y17=2i^i^i + "7^^*) ~ ^(^«)) 

Gaussian random variable ^^=2 ^^^^^^ ^^ + ^~^^2n^) control the result- 
ing error by some Wasserstein distance estimate between these two random 
variables. To preserve symmetry in the estimate and in particular to obtain 
{I'n, (^0^) instead of ^ X]"=2(^'(^«))^ ™ denominators, we write Gi as 
the sum of two independent variables distributed according to Af{0, ^). 



Let = ^, = for i > 2 and G^ ~ M{0, ^) be independent from 
(G^...,G"). One has 



By Lipschitz continuity of y i— )• A 1 and boundedness of V" , one deduces 
(as in Lemma 5) that 




-1 fvi^i)-vi=^i+-k(G^+G'))+j:7=2iyM-yi^^+TkG')) 





1 (^J:UiVM-V{x, + ^G')) 



A 1 



)) 




where, by conditioning by (G 



. . . , G") and using (A.3) 



E = 2E i^G^ 





) 



Let 



n , 

1=1 ^ 





/2 1^ 



i=l 



n 
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withxi e [xi,Xi+^^G'']. By boundedness of F" and y^^^ and since £[(((5*)^— 

1)((GJ)2 - 1)] = as soon as j / « and E[V'^^\xi){&fi{&? " 1)] = as 
soon as j ^ {l,i}, E[{X -Yf] < % which implies that Wi{X,Y) < The 

density of X is bounded by (/^7r(z/„, (1/')^))"^/^. By Lemma 2, the function 
Q takes its values in [0,/^]. Moreover 

a.e(«,i-) = 4e(<.,(,) + ^.-* 

which ensures that sup^^ ^).|j|>^i/4 \dbG{a,b)\ < +oo. Lemma 6 applied with 



(2n)l/4 



implies that 



E 



E 



{V'jxi))^ 2X 
2n ' /2 



C 

< ^ + 



+ 



n (n(^„„(y')2))V4 ni/4^(z.„,(y/)2)' 

where C does not depend on x nor on n. One concludes by remarking that, 
by (A.7), 



E 



2n ' P J 



g(K,(y')'>,K,0)- 



□ 



This concludes the proof of Proposition 4. 



3.3.3. Proofs of Propositions 5 and 6. Finally, it remains to prove Propo- 
sitions 5 and 6. 



Proof of Proposition 5. Since for I <i <n, {M]f^) is a J"^-martingale 
and g{YsC ■> • • • ) Ys'J^) is -measurable, one has 



E 
(3.22) 



n \' 

i=l J 



1 



n \nt\-l 



J] 5] E E {Mi:l,-Mi:^){M^^l,-Mn\H giY:^,...,Yl^)g(Y^r 



*J=1 k=\ns\ 
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Using the boundedness of and (/?", then Lemma 5, (3.18) and the equahty 



E 



k+l 



deduced from (A. 3), one obtains 



E 



E 



— j^3/2 ^ 

< £ 

— j^3/2 ji 



ELi(^'(4'")7^Gi+i+l|v"{4'")) 



E 



-E 
X E 



A 1 IJT 



< c 



3/2 



n 



Plugging this estimate into (3.22) and using (3.16), one concludes that 



E 



. i=l 



< c 



\nt] — Ins] 1 



lnt]-l 



n 



3/2 



+ [niy'iYipf] + ^n{v'{Yipf 



k= [ns] 



One concludes with the local boundedness of $r \mapsto \sup_{n\ge n_0} E[(V'(Y_r^{1,n}))^2]$ 
deduced from (3.7) and the assumption $\sup_{n\ge n_0} E[(V'(X_0^{1,n}))^2] < +\infty$. □ 

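Proof of Proposition 6. Since the function $\varphi$ is compactly supported 
and $V'$ is continuous, one may suppose that $k$ is large enough so that 
$\forall x \in \mathbb R$, $|V'\varphi'(x)| \le \|\varphi'\|_\infty\sqrt{(V'(x))^2 \wedge k}$ and therefore $\langle\mu_r^n, |V'\varphi'|\rangle \le 
\|\varphi'\|_\infty\sqrt{\langle\mu_r^n, (V')^2 \wedge k\rangle}$. By boundedness of $\mathcal G$ and $\varphi''$, then using (3.2), (3.3) 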
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and (3.5), one deduces 

, (^0' A A:), {f,^,V")) - ri/2((/.;f, (V'f), {f^^, V"))\ 



E|Ffc(/i")-F(^")| <C / E 

J s 



+ (Vf A k), if,-, V")) - Giif,:^, ivf), (/^^ v"))\V{^?,{v'y^k) 



dr 



< C / n^MWY^W) + (/^", {{V'f - k)^)]dr. 



(3.23) 



Since |T^'(r/'")| < \V'{xl^'')\ + - l^o^'"| and using (3.8) and the 

Markov inequality, one obtains that for r € [0, t] , 

E[(/x;!, {{V'f - k)+)] < E [{V'{Y,'n)\\v'iY^^^-^\>V-k} 



< 2E 

< c(e 

+ 



(y'(Xo^'"))2 + ||y"||^|y/'«-yo^'"|2) (1 



+ 1, 



^ 1 , n -y^ 1 . n I \ y/k 



(^'(^0'"))^l||y,(^l,n)|>v^| 



A;2 



E[(y'(Xoi'"))2] + 



2 



+ (tVt2)P[|l/'(Xo^'")| >Vk/2] 



Therefore, when the random variables $((V'(X_0^{1,n}))^2)_{n\ge n_0}$ are uniformly 
integrable, 

$$\lim_{k\to\infty}\sup_{n\ge n_0}\sup_{r\in[0,t]} E\big[\langle\mu_r^n, ((V')^2 - k)^+\rangle\big] = 0. \tag{3.24}$$

One concludes by plugging this result into (3.23). 



□ 



3.4. Proof of Proposition 1. We end this Section by a proof of Proposi- 
tion 1 about the limiting average acceptance ratio. 
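Before turning to the proof, the notion of average acceptance ratio can be illustrated by a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a standard Gaussian target ($V(x) = x^2/2$), a chain started at equilibrium, and the illustrative values $n = 50$, $\ell = 2.38$ and 3000 steps. In this stationary Gaussian case, the classical limiting acceptance rate is $2\Phi(-\ell/2)$, which the empirical rate should approach up to Monte Carlo and finite-dimension errors.

```python
import math
import random
from statistics import NormalDist

def rwm_acceptance_rate(n, ell, n_steps, rng):
    """Run RWM on the n-fold product of N(0,1) (V(x) = x^2/2), started at
    equilibrium, with proposal variance ell^2/n; return the acceptance rate."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = ell / math.sqrt(n)
    accepted = 0
    for _ in range(n_steps):
        y = [xi + scale * rng.gauss(0.0, 1.0) for xi in x]
        # Log acceptance ratio sum_i (V(x_i) - V(y_i)): one common
        # accept/reject decision is taken for all the components.
        log_ratio = sum(0.5 * (xi * xi - yi * yi) for xi, yi in zip(x, y))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = y
            accepted += 1
    return accepted / n_steps

rng = random.Random(12345)   # fixed seed, for reproducibility only
ell = 2.38                   # illustrative choice of the scaling constant
rate = rwm_acceptance_rate(n=50, ell=ell, n_steps=3000, rng=rng)
stationary_limit = 2.0 * NormalDist().cdf(-ell / 2.0)
print(rate, stationary_limit)
```

Starting the chain away from equilibrium (for instance from a Dirac mass) would change the acceptance rate along the transient phase, which is the regime addressed by Proposition 1.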

Proof of Proposition 1. By Lemma 5 and [22, Proposition 2.4] which 
is also a consequence of (A. 5) for the choice a = 0, there is a finite deter- 
ministic constant C not depending on t such that 



1 



< 



c 



n 
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E 



With (3.3), one deduces that 
1 

[nt\ I ~ J2 

(3.25) 

< C f ^ + (E + E^/^ 



ntj + ll-^n 



n 



T{E[iV'{Xt))%E[V"iXt)]) 



+ E 



{f,U,V")-E[V"iXt)] 



One has for $k \in \mathbb N$, 

$$\begin{aligned}
E\Big|\langle\mu^n_{\lfloor nt\rfloor/n}, (V')^2\rangle - E[(V'(X_t))^2]\Big| &\le E\Big|\langle\mu^n_{\lfloor nt\rfloor/n} - \mu^n_t, (V')^2\rangle\Big| + E\big[\langle\mu^n_t, ((V')^2 - k)^+\rangle\big] \\
&\quad + E\Big|\langle\mu^n_t, (V')^2 \wedge k\rangle - E[(V'(X_t))^2 \wedge k]\Big| + E\big[((V')^2 - k)^+(X_t)\big].
\end{aligned}$$

By the end of the proof of Proposition 4 (see in particular (3.21)), the first 
term in the right-hand-side converges to 0 locally uniformly in $t$ as $n \to \infty$. 
By (3.24) and Theorem 1, the sum of the second and last terms in the 
right-hand-side converges to 0 as $k \to \infty$ uniformly for $n \ge n_0$ and locally 
uniformly in $t$. Last, for fixed $k$, the third term converges to 0 as $n \to \infty$ locally 
uniformly in $t$ by Theorem 1. One deduces that 
$E\big|\langle\mu^n_{\lfloor nt\rfloor/n}, (V')^2\rangle - E[(V'(X_t))^2]\big|$ 
converges to 0 as $n \to \infty$ locally uniformly in $t$. Dealing with the other 
expectation in the right-hand-side of (3.25) in a similar but easier way (since 
$V''$ is bounded), one concludes the proof. □ 

APPENDIX A: PROOFS OF TECHNICAL LEMMAS 

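In this section, we first give a proof of Lemma 2 which gives basic 
properties of the functions $\Gamma$ and $\mathcal G$. Then, we give some explicit formulas for some 
expectations involving Gaussian random variables. 

Proof of Lemma 2. The functions $\mathcal G$ and $\Gamma$ are clearly continuous on 
$(0, +\infty) \times \mathbb R$. We recall the usual tail estimate for the Normal law: $\forall x > 0$, 

$$\int_x^{+\infty} e^{-\frac{y^2}{2}}\,\frac{dy}{\sqrt{2\pi}} \le \int_x^{+\infty} \frac{y}{x}\,e^{-\frac{y^2}{2}}\,\frac{dy}{\sqrt{2\pi}} = \frac{e^{-\frac{x^2}{2}}}{x\sqrt{2\pi}}. \tag{A.1}$$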
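The Gaussian tail estimate (A.1), namely $P(G > x) \le e^{-x^2/2}/(x\sqrt{2\pi})$ for $x > 0$, is easy to check numerically. The sketch below is a simple sanity check; the function names `normal_tail` and `tail_bound` and the sample points are illustrative choices. The tail is evaluated through the complementary error function.

```python
import math

def normal_tail(x):
    """P(G > x) for a standard normal G, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def tail_bound(x):
    """Right-hand side of (A.1): exp(-x^2/2) / (x * sqrt(2*pi)), for x > 0."""
    return math.exp(-0.5 * x * x) / (x * math.sqrt(2.0 * math.pi))

for x in (0.1, 0.5, 1.0, 2.0, 4.0):
    print(x, normal_tail(x), tail_bound(x))
```

The bound is crude near $x = 0$, where its right-hand side blows up, and becomes sharp as $x$ grows.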



One deduces that for a > 
(A.2) 



2^ 



a < 



l^f2 



vra 



iC 8a and ^(a, 6) < 



21 



^/2 



vro 



Since for < a < 6, Q{a, 6) < x 1 x 1, one deduces (3.4). Moreover, (A.2) 
implies that Q is continuous on {(0, +oo] x M} U {{0} x (-oo, 0)}. With the 




continuity of (a, 6) i— on (0, +oo] x R under the convention = 0, one 
deduces that T is continuous on (0, +oo] x R. 

For /? > 0, hm^^Q+ $ ~ a/^) = 1 and therefore hma^o+, 6-^/3 S{a, b) 

G{0,f3), which completes the proof of the continuity properties of Q. 



Since for {a,b) £ (0,+oo)xM, dbT{a,b) = -^e^^$ [I \^-^ - y/ajj < 0, 

for fixed a G (0, +oo), the function b i— t- T{a,b) is decreasing. One easily 
checks that for fixed 6 < 0, lim^_^Q+ r(a,6) = P + = r(0,6) and fixed 

b > 0, lima_^o+ r(a, b) = + l'^e~~ = r(0, b). With the previous monotonic- 
ity property, one deduces that lima_i.Q+ r(a, 0) = P = r(0, 0). The continuity 
of 6 I— 7- r(0, 6) and Dini's lemma implies that b i— )■ T{a,b) converges locally 
uniformly to 6 i— )■ r(0, 6) as a — )• 0+ and that T is continuous on [0, +oo] x M. 
Since T is positive on [0, +oo] x M, one deduces that (3.2) holds. 

For a > 0, by (A. 2), limb_,_oo 6?(a, 6) = 0. Since limb_,_oo ^ (-37^) = 1> 
one deduces that limb^_oo r(a, b) = P. By monotonicity of b ^ r(a, 6), one 
deduces that V(a,6) e (0,+oo) x M, T{a,b) < P. This bound stih holds for 
a G {0, +00} by continuity (or using the explicit expression of F). 
For (a, 6) G (0, +00) x M, one has 



2 



dbr{a,b) = --g(a,b) 

dar{a,b) = ^^g{a,b) ^ 



2 2^/2^ 

/2 . P 



dMa,b) = --gia,b) + ^ 

P / I b 
daGia, b) = -G{a, b) - — = + 



Since ^ and (a, 5) G [0, +00] x M 1— )• ^^e~~s^ are bounded and using (3.4), 
one easily checks (3.3) and (3.5). Let us give some details for the inequality: 

iV^AV^)\gia,b)-g{a',b)\ < C{\a' -a\ + \V^-V^\). 
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Let us assume that < a < a' . Then we have: 

^ r' 

{^/aA^Al')\g{a,b)-g{a,b)\ = ^/^ / daG{x,b)dx 

J a 



a' i2 



e 8^ ax 



2 17^ + 2x3/2 



< CV^ I { + - ] dx < C i (a' - a) + r ^ dx 



IX X 



X 



<C\{a - a 



+ ^ dx^ < C ((a' - a) + (^fd - V^)) 



□ 



Lemma 7. For a G [0,+cx3), a, (3,^,6 e M and independent normal 
random variables G, G and G, one has 



7 + a'+/32 



(A.3) 



a^/a^ + P^ 27 



(A.4) 



E ( G ( 1 - e^^+'^^+^y 



a2 + /32 ^ 



7 \ 



(A.5) 



7 + + 
" Va2 + /32 

a' 



^Ja^ + /32 y v^27r(a2 + /32) 



(A.6) 



I c2+^2+^2 / ^ + a2+/32 + 5: 



Va2 + /32 + ^2 / v/27r(a2 + /32 + ^2) I ' 
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(A.7) E(g(a,aG + /3)) = gfa + ^,/3 

Proof. In this proof, the identity $E\big(f(G)e^{\alpha G - \alpha^2/2}\big) = E\big(f(\alpha + G)\big)$ is 
repeatedly used. Let us start with (A.3). By the symmetry of the normal 
law, a I— 7- E (^G ^e"'-''^^^^'^ ^^^^ function and we only need to 

check (A. 3) for a > 0. Conditioning by G for the third equality, we get 

E ( G (e"^+^^+^ A l)) = E (e^+4e"«-4e/3^Gl^^<_^^ + Gl^^^^^ 

(7+;3G) ^ 

+ ^=E ( e ^ 



We deduce (A. 3) by remarking that the two last terms compensate each 
other since 



7 + — + /3G 



2 2a2 2a2 

To obtain the inequality (A. 4), we notice that 

E f G f 6"^+^^^+^ A l) ) = E f G f e"«+'^^+^ A l) ) - E(G) 



-E(G(l-e°^+^^+^) ' )=- " X + 



and conclude using (3.4). To derive (A. 5), one obtains by conditioning by G 
for the second equality 



E 



(g2 [e'^G+pG+^i ^ 1 



e— +-E { (G^ + 2aG + a^)$ ( + + + ^' "l U IE f + 



1^1 J J \ \ \P\ 
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By integration by parts, 



j + aG\\ 1 



m J 



a 



7 + ax \ , 



xe 



'^^dx 



|/3|G-aG<7 + 



2vr|/3| 



xe 



2/3^ 



(A.9) 

and 

(A.IO) 



7 + qG + + 



2 9 

v/2^(a2 + /32)3 



2^|/3| 



V27r(a2 + /32) 



g 2(c«^+/3^) 



One obtains (A. 5) by plugging this last equality together with (A.9) also 
written with (a, 7) replaced by {—a, —(7 + a"^ + /3'^)) in (A. 8). 

To prove (A. 6), conditioning by G, using the first assertion and then using 
(A.IO), one obtains 



E (GG fe"«+/^^+^^+^ A 1 n = ae^+^E Ge'"^^ 



j + 6G + a^ + P^ 



ae 



1 



E (G + 5)$ 



7 + (5G + + /32 + 52 



7 + a2 + /52 + 52\ 



g 2(a2+/32 + ^2) 



Last, 



-^E (g(a, aG + /3)) = e 2 



v/a2 + /32 + 52 y v/27r(a2 + /32 + 52) 
aG - /2a2/2 + /3 



G < 



2^ 



iha+i^f/i^P) ^^ /^^/a + ;2a2 /4 /3 - 2(a + /2a2/4) 

(jr I- 



2^ 



which yields (A. 7). 



□ 
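The Gaussian shift identity $E\big(f(G)e^{\alpha G - \alpha^2/2}\big) = E\big(f(\alpha + G)\big)$, on which the proof of Lemma 7 relies, can be verified numerically for a given test function. The following is a minimal sketch under illustrative assumptions: the quadrature rule, the value $\alpha = 0.7$ and the test function $f(x) = x\,(e^x \wedge 1)$ are choices made here and are not taken from the paper.

```python
import math

def gauss_expect(h, lo=-12.0, hi=12.0, n=48000):
    """Approximate E[h(G)], G ~ N(0,1), by the trapezoidal rule on [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for j in range(n + 1):
        x = lo + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * h(x) * math.exp(-0.5 * x * x)
    return total * step / math.sqrt(2.0 * math.pi)

alpha = 0.7  # illustrative value

def f(x):
    """Test function x * min(e^x, 1), of the type appearing in (A.3)."""
    return x * min(math.exp(x), 1.0)

lhs = gauss_expect(lambda x: f(x) * math.exp(alpha * x - 0.5 * alpha ** 2))
rhs = gauss_expect(lambda x: f(alpha + x))
print(lhs, rhs)
```

Both quantities agree up to the quadrature error, which reflects the change of variables $x \mapsto x - \alpha$ in the Gaussian integral.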
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